1 TCACAGAACA TGTCCAACAA CAGCCCCGAG TATGCTTTGG TTTTCACCAT 
51 CTCGGGTGCT ATGGCCACCA TGGTCTCCAG TGGCCTGGGT GCTGCCTGTG 
101 GCATGGCCAA GAATGGCACC GGCATCATGG CCATGTCTGT CATGTGGCCA 
151 GAGCTGATCC ACATGAAGTC CATCATCCCA GTGGTCATGG CTGGTATCAT 
201 CACCATCTAT GGCCTAGTGG CGGCTGTCCC CCCTGCCAAC TCCCTGAATG 
251 ATGACAACAG TCTCTATAGC AGTTTCCTCC AGCTGGGCGC TGGCCTGAGT 
301 GGCCTGGCAG CCGGCTTTGC CATCGTCATC GTGGGGGACA CTGGCAAGTG 
351 TGGCACTGCC CAGCAGCCCC GACTATTTGT AGGCATGATA CTGATCCTCA 
401 TCTTTGCCAA GGTGCTCATT CTCTCCACAA AGCAGCCCCT CTCAAAACCC 
451 ACCAGTCACA GAATACGATG TAAAGACCAC CCCTCCTCAT TCCGGAACAA 
501 ACAGCCTGAC ACGCATGTGC TGGGCAGCTG GCCCTCAGTA GTTGATCTTC 
551 TAAGTGTACA GTGTCCTCGT GTTCATCGTC TGTTGGCCAG GCCTTGCCCC 
601 CTCCCGCCCC ATGCTGTGGA CATCTGAACC TAG 
(SEQ ID NO:1) 



FEATURES: 



5'UTR: 1-9 

Start Codon: 10 

Stop codon: 625 

3'UTR: 628 
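The feature coordinates above can be sanity-checked with a short translation sketch. It assumes the standard genetic code and 1-based positions, and it uses only the first 50 nucleotides of the cDNA; the names `translate_from` and `CDNA_PREFIX` are illustrative and not part of the original listing.

```python
# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_from(seq, start):
    """Translate seq from 1-based position `start` until a stop codon
    or until no complete codon remains."""
    protein = []
    for pos in range(start - 1, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[pos:pos + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First line of the cDNA listing (positions 1-50), spaces removed.
CDNA_PREFIX = "TCACAGAACATGTCCAACAACAGCCCCGAGTATGCTTTGGTTTTCACCAT"

START_CODON = 10   # from FEATURES
STOP_CODON = 625   # from FEATURES

# Number of codons between the start codon and the stop codon.
orf_codons = (STOP_CODON - START_CODON) // 3

print(translate_from(CDNA_PREFIX, START_CODON))
print(orf_codons)
```

Translating from position 10 reproduces the first residues of the protein listing, and the distance from the start codon to the stop codon corresponds to a 205-codon open reading frame, which matches the length of the protein sequence.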




FIGURE 1A 



Docket No.: CL000651 
Serial No.: 09/727,770 
Inventors: LI, Zhenya et al. 
Title: ISOLATED HUMAN TRANSPORTER., 



HOMOLOGOUS PROTEINS: 

TOP BLAST Hits: 



gi|4502313|ref|NP_001685.1| ATPase, H+ transporting, lysosomal .
gi|137477|sp|P23956|VATL_BOVIN VACUOLAR ATP SYNTHASE 16 KD PROT.
gi|227919|prf||1713409a h ATPase 16K [Bos taurus]
gi|3024812|sp|O18882|VATL_SHEEP VACUOLAR ATP SYNTHASE 16 KD PRO.
gi|418179|sp|Q03105|VATL_TORMA VACUOLAR ATP SYNTHASE 16 KD PROT.
gi|6753144|ref|NP_033859.1| ATPase-like vacuolar proton channel.
gi|67954|pir||PXBOV6 H+-transporting ATPase (EC 3.6.1.35), vacu.
gi|137478|sp|P23380|VATL_DROME VACUOLAR ATP SYNTHASE 16 KDA PRO.
gi|3334403|sp|O16110|VATL_AEDAE VACUOLAR ATP SYNTHASE 16 KD PRO.
gi|1718095|sp|P55277|VATL_HELVI VACUOLAR ATP SYNTHASE 16 KD PRO.
gi|401334|sp|P31403|VATL_MANSE VACUOLAR ATP SYNTHASE 16 KD PROT.
gi|10442628|gb|AAG17394.1|AF277150_1 (AF277150) V-ATPase 16 kD .
gi|7294725|gb|AAF50062.1| (AE003544) CG7547 gene product [Droso.
gi|2493142|sp|Q26250|VATL_NEPNO VACUOLAR ATP SYNTHASE 16 KD PRO.
gi|251354|gb|AAB22509.1| vacuolar H(+)-ATPase proteolipid subun.
gi|2493143|sp|Q00607|VATL_CANTR VACUOLAR ATP SYNTHASE 16 KD PRO.



BLAST to dbEST: 

gi|9336427 /dataset=dbest /taxon=960.
gi|6359805 /dataset=dbest /taxon=9606
gi|9134224 /dataset=dbest /taxon=9606
gi|10219114 /dataset=dbest /taxon=96.
gi|9347217 /dataset=dbest /taxon=960.
gi|9152104 /dataset=dbest /taxon=9606
gi|9894156 /dataset=dbest /taxon=960.



EXPRESSION INFORMATION FOR MODULATORY USE: 

library source: 

Expression information from BLAST dbEST hits: 



gi|9336427 Human uterus
gi|6359805 Human fetal liver
gi|9134224 Human brain
gi|10219114 Human lung
gi|9347217 Human placenta
gi|9152104 Human skin
gi|9894156 Human ovary



Expression information from PCR-based tissue screening panels 

Human Bone marrow 

Human Brain 

Human Colon 

Human Fetal Brain 

Human fetal heart 

Human Fetal Kidney 

Human fetal liver 

Human Heart 

Human Kidney 

Human Liver 

Human Lung 

Human Pancreas 

Human Placenta 

Human Prostate 

Human skeletal Muscle 

Human Small Intestine 

Human Spleen 

Human Testis 
1 MSNNSPEYAL VFTISGAMAT MVSSGLGAAC GMAKNGTGIM AMSVMWPELI 
51 HMKSIIPVVM AGIITIYGLV AAVPPANSLN DDNSLYSSFL QLGAGLSGLA 
101 AGFAIVIVGD TGKCGTAQQP RLFVGMILIL IFAKVLILST KQPLSKPTSH 
151 RIRCKDHPSS FRNKQPDTHV LGSWPSVVDL LSVQCPRVHR LLARPCPLPP 
201 HAVDI 
(SEQ ID NO:2) 

FEATURES: 

Functional domains and key regions: 

[1] PDOC00001 PS00001 ASN_GLYCOSYLATION N-glycosylation site 
35-38 NGTG 



[2] PDOC00005 PS00005 PKC_PHOSPHO_SITE Protein kinase C phosphorylation site 
Number of matches: 4 

1 111-113 TGK 

2 139-141 STK 

3 149-151 SHR 

4 160-162 SFR 



[3] PDOC00006 PS00006 CK2_PHOSPHO_SITE Casein kinase II phosphorylation site 
Number of matches: 2 

1 78-81 SLND 

2 176-179 SVVD 



[4] PDOC00008 PS00008 MYRISTYL N-myristoylation site 
Number of matches: 8 



1 16-21 GAMATM 

2 25-30 GLGAAC 

3 27-32 GAACGM 

4 31-36 GMAKNG 

5 68-73 GLVAAV 

6 93-98 GAGLSG 

7 98-103 GLAAGF 

8 172-177 GSWPSV 

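The site listings above can be reproduced with a small pattern scan. This is a minimal sketch, assuming the published PROSITE consensus patterns for PS00001, PS00005, PS00006 and PS00008 written as Python regular expressions; the helper name `scan` and the dictionary `PATTERNS` are illustrative. A lookahead is used so that overlapping matches (for example 25-30 and 27-32) are both reported.

```python
import re

# Protein sequence from the listing (SEQ ID NO:2), spaces removed.
PROTEIN = (
    "MSNNSPEYALVFTISGAMATMVSSGLGAACGMAKNGTGIMAMSVMWPELI"
    "HMKSIIPVVMAGIITIYGLVAAVPPANSLNDDNSLYSSFLQLGAGLSGLA"
    "AGFAIVIVGDTGKCGTAQQPRLFVGMILILIFAKVLILSTKQPLSKPTSH"
    "RIRCKDHPSSFRNKQPDTHVLGSWPSVVDLLSVQCPRVHRLLARPCPLPP"
    "HAVDI"
)

# PROSITE patterns rewritten as regular expressions (assumed equivalents):
#   N-{P}-[ST]-{P}, [ST]-x-[RK], [ST]-x(2)-[DE],
#   G-{EDRKHPFYW}-x(2)-[STAGCN]-{P}
PATTERNS = {
    "ASN_GLYCOSYLATION": r"N[^P][ST][^P]",
    "PKC_PHOSPHO_SITE": r"[ST].[RK]",
    "CK2_PHOSPHO_SITE": r"[ST].{2}[DE]",
    "MYRISTYL": r"G[^EDRKHPFYW].{2}[STAGCN][^P]",
}


def scan(seq, pattern):
    """Return (start, end, motif) for every match, 1-based and inclusive.
    The lookahead makes the search report overlapping matches too."""
    hits = []
    for m in re.finditer(f"(?=({pattern}))", seq):
        motif = m.group(1)
        start = m.start() + 1
        hits.append((start, start + len(motif) - 1, motif))
    return hits


for name, pattern in PATTERNS.items():
    print(name)
    for start, end, motif in scan(PROTEIN, pattern):
        print(f"  {start}-{end} {motif}")
```

These short patterns match frequently by chance, so the hits indicate candidate sites rather than confirmed modifications.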
Membrane spanning structure and domains: 

Helix  Begin  End  Score  Certainty
1      14     34   1.889  Certain
2      37     57   0.733  Putative
3      60     80   2.030  Certain
4      95     115  1.775  Certain
5      127    147  1.699  Certain

BLAST Alignment to Top Hit: 

>gi|4502313|ref|NP_001685.1| ATPase, H+ transporting, lysosomal 
(vacuolar proton pump) 16kD 

>gi|137479|sp|P27449|VATL_HUMAN VACUOLAR ATP SYNTHASE 16 
KD PROTEOLIPID SUBUNIT >gi|107394|pir||A39367 
H+-transporting ATPase (EC 3.6.1.35) chain PKD1 - human 
>gi|189676|gb|AAA60039.1| (M62762) vacuolar H+ ATPase 
proton channel subunit [Homo sapiens] 
Length = 155 



Score = 181 bits (455), Expect = 5e-45 

Identities = 110/153 (71%), Positives = 114/153 (73%), Gaps = 14/153 (9%) 

Query: 2   SNNSPEYALVFTISGAMATMVSSGLGAACGMAKNGTGIMAMSVMWPELIHMKSIIPVVMA 61
           S + PEYA  F + GA A MV S LGAA G AK+GTGI AMSVM PE I MKSIIPVVMA
Sbjct: 4   SKSGPEYASFFAVMGASAAMVFSALGAAYGTAKSGTGIAAMSVMRPEQI-MKSIIPVVMA 62

Query: 62  GIITIYGLVAAVPPANSLNDDNSLYSSFLQLGA----GLSGLAAGFAIVIVGDTGKCGTA 117
           GII IYGLV AV  ANSLNDD SLY SFLQLGA    GLSGLAAGFAI IVGD G  GTA
Sbjct: 63  GIIAIYGLVVAVLIANSLNDDISLYKSFLQLGAGLSVGLSGLAAGFAIGIVGDAGVRGTA 122

Query: 118 QQPRLFVGMILILIFAKV---------LILSTK 141
           QQPRLFVGMILILIFA+V         LILSTK
Sbjct: 123 QQPRLFVGMILILIFAEVLGLYGLIVALILSTK 155 (SEQ ID NO:4)

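The identity and gap counts reported for this alignment can be recomputed from the aligned rows. The sketch below is illustrative only: the gap columns in `QUERY` are inferred from the residue numbering in the alignment (1, 4 and 9 gap positions in the three blocks), and the function name `alignment_stats` is hypothetical.

```python
# Aligned rows from the BLAST alignment, concatenated block by block.
# '-' marks a gap; gap placement in QUERY is inferred from residue numbering.
QUERY = (
    "SNNSPEYALVFTISGAMATMVSSGLGAACGMAKNGTGIMAMSVMWPELIHMKSIIPVVMA"
    "GIITIYGLVAAVPPANSLNDDNSLYSSFLQLGA----GLSGLAAGFAIVIVGDTGKCGTA"
    "QQPRLFVGMILILIFAKV---------LILSTK"
)
SBJCT = (
    "SKSGPEYASFFAVMGASAAMVFSALGAAYGTAKSGTGIAAMSVMRPEQI-MKSIIPVVMA"
    "GIIAIYGLVVAVLIANSLNDDISLYKSFLQLGAGLSVGLSGLAAGFAIGIVGDAGVRGTA"
    "QQPRLFVGMILILIFAEVLGLYGLIVALILSTK"
)


def alignment_stats(query, sbjct):
    """Count alignment columns, identical residues and gap columns."""
    if len(query) != len(sbjct):
        raise ValueError("aligned rows must have the same length")
    identities = sum(q == s and q != "-" for q, s in zip(query, sbjct))
    gaps = sum(q == "-" or s == "-" for q, s in zip(query, sbjct))
    return {"length": len(query), "identities": identities, "gaps": gaps}


stats = alignment_stats(QUERY, SBJCT)
print(stats)
```

The counts agree with the Identities and Gaps figures in the alignment header; the Positives figure additionally depends on the substitution matrix and is not recomputed here.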



Hmmer search results (Pfam): 

Model    Description             Score  E-value  N
PF00137  ATP synthase subunit C  14.8   0.028    2

Parsed for domains: 

Model    Domain  seq-f  seq-t      hmm-f  hmm-t      score  E-value
PF00137  1/2     12     73         1      65     [.  7.6    2.4
PF00137  2/2     89     133    ..  1      53     [.  14.6   0.031
1 GCTGTGGGGC CAGGAAAAGG AGAGAAGGTG AAACCCCCGT CAGTCCCTCA 
51 CAATCAGCAC GTGGAAATCT AGAAATGCAG GAGAGGCCTG GACTCATGGT 
101 GGAATCCAGA ATGAAAGAGG TGGACGACTG AATGAGCAGA AGGAGGCAAG 
151 CACCAGAGGC TTGGGGGTCA CATTTCTTGG AAGTGGCCTG GAGCTGGCAG 
201 ATGAGAACTC TGGCTACCCG TCCTCATTCC ACTAACAGTA GCTCCTCTAA 
251 CGACATGCCC CTTCCCTCTG TACCCCGCTC CGCATGCGGC AAGTAGTTCC 
301 CGGACGCGAC CCTTCCCCCT GTACCCCGCT CCGCATGGGG CCAGTAGTTC 
351 CCGGACGCGC CCCTTCCCTC TGTACCCCGC TCCGCATGCG GCAAGTAGTT 
401 CCCGGACGCG CCCCTTCCCT CTGTACCCGG CTCCGCATGC GGAAAGTAGT 
451 TCCTACGGTG TTGGTTTTGC ATGTAGATGA AACCCTTTGA GGGGTAAAGG 
501 TTTTTTTTTT AAGTACTTTA GCAAATGCAA ACTGTTATTA TCAATATTAG 
551 CCAGCATCTT TTTTTTTTTT TTTTTTTTTT TTTTTGAGAT GGAGTTTCGC 
601 TCTTGTCACC CAGGCTGGAG TGCAATGGCA AAATCTAGGC TCACTGCAAC 
651 CTCCGCCTCC CAAGTTCAAG CGATTCTCCT GCCTCAGCCT CCCAGGTAGC 
701 TGGGATTACA GGCGTGTGCA ACCACACCCA GCTAATTTTT GTATTTTTAG 
751 TAGAGACAGG GTTTCACCAT GTTGGCCAGG CTGGTCTCGA ACTCCTGACC 
801 TCATGTGATC CATCCGCCTC AGCCTCCCAA AGTGCTGGGA TTACGTAGCC 
851 AGTGTCTTTC TTAAGTGCCT GTCAAATAAT GCTCCTGGTT TATAAGTGCC 
901 CCTGGCTCTA CCTTCTGGGT GCTCAGACAC CAACACAGAG AGAACAGAAT 
951 TAACATCCTG AGAAGTTACA TATGCTAAAA TATAAAGAGT AAGATTGTGA 
1001 GGAAACTGCA GGGGAAGCAG GTAGGTTAGG AAAAGGTATC CTCACTTTTC 
1051 TGCTGACCGA TGAGTCATAA TTCTTGAATT TCGGTGCTGG AAAGGTCCAT 
1101 TAAGCATTCC AGGAGATTCT AGGGAGCTTC CAGAATGGTA GAAGAACTGG 
1151 AACCATAAAG CCTGGGGAAG GGATGGAAGT CCTTGGGAAA GAAGCACTAA 
1201 ACAGCCCAGT GGAGACAAGG AAGGACTGGT CTCTCCTGTG CTTCGAGCCC 
1251 AGCAATGATT ATTCACTCAG ATATGCCCCG GCAGGTCCTG CTGCTAGAGC 
1301 CAGTGCTGTT CCCAGACCCA GGCAAGGTGC CATCCTACCC CTGACAGGAA 
1351 ACAGGGCAGG AGGTGGGGCT GCCCCGGGTG CCTGGTGTTG GGAGGGGCCG 
1401 GGGGGAATCC CGGGTGTTGG GAGGACAAGG CAGAGTCAGC TAGCTGTGAG 
1451 GCTAGGGGAG AAGACCTCTC TAGTCTGGGA GAGACCCCTC CTTTCCTAGC 
1501 TCCTTGTACT TCCAAAAAGG CAGGCTTCCT GCTGTTACTA ACCATACCAG 
1551 GACTGACTAT ACAGCAGCCA GAAAGATTCT GAGAAACCTG TGATAGAGAA 
1601 AAACAGATGC GGAAGCGGGA GAAGAGAATT TCATAGGACA CTAGGGAAAG 
1651 AGAATGGGAA CTTGTGGTCT AAAGAGGGAA CCAAGTCTGG CCAACATGGT 
1701 GAAACCCCAT CTCTATTACA AATACAAAAA TTAGCTGGGC ATGGTAGTGC 
1751 ATGCCTGTAA TCCCAGCTAC TCAGGAGGGT AAGGCATGAG AATCACTTGA 
1801 GACTGGGAGG CGGAGGTTGC AGTGAGCCGA GACTGCACCA CTGCACTCCA 
1851 GCCTGGGCAA CAGAGCAAGA CCTCGTCTCA AAAAAAAAAA AAAATTAAAT 
1901 TTAAATTAAA AAAAAATAAA CAGGGAACCA ACAAGAGCTG GCAGAACAGA 
1951 ATAAAGTCTC AAGCCAAATA ACTCCCTTGC CTTGGAAGAA CAAGGCTGCC 
2001 AGCGTCTTGG AGCCTCTGTT TATCGGGTAC CAGTTCAAAG GACAGTGAGC 
2051 CTGAGCTGGC CTGGGAGGCC CTCCCCCCTC CCAGATGAAA ACAATAGGCC 
2101 TGTTTTCCTG AGCTCTTCCT GTAATCCAGA AGGCCCACAC AGAGAGGAAG 
2151 AGGGGGGCAA AGGCAGTGGC TATACCCAGT GGGGGAGGGG ATATTTAGCC 
2201 TCCCATAAAT TCATCAGCTC CCTTAAAGAC ACCCCAAAAC ACCAACAATC 
2251 TAAGTGTTAA AATAGTGACT GCTATGCAAA TGGAGCTTTA AAACCTATCC 
2301 CTTAGCCCCA GTACCACCAG ATTACTAACC CTAAACCCCA TCTGTAGGAG 
2351 ATATTCTGAA GCCACCACAG GGGAAGGGAT AAGGGCCTGA GAGACAAAGG 
2401 ACAGATGGGG TCTCCCCAAC AATTTAAGTT AAGTTCCACA AGGATACAGT 
2451 ACTGGCAGAG ATTTGGAAGT AGGGGCAAGT ATTCTGACAG AAGGGTGGTG 
2501 TCTTAGGCAC CCTTCAATTA GGAGTAGCTA AAGGCTGTGT GTGTGTCTGT 
2551 GTGTGTGCAT AAGAAAAGAA ATAGGAGGGT GTGTGTGTGG TAAGAAAGAG 
2601 CATCTTGGCT GGGCGCGGTG GCTCACCCCT ATAATCGCAG CACTTTGGGA 
2651 TGCCAAGGCT GGCGGATTGC CTGAGCTCAG GAGTTTGAGA CCATACGGGG 
2701 CAACATGGTG AAACCCCATC TCTACTAAAA ATACAAAAAA TTAGCTGGGC 
2751 ATGGTGGTGC GTGCCTATAG TTCCAGCTAC TCGGGAGGCT GAGGCATGAG 
2801 AATGGCTTGA GCCCTGGAGG CAGAGGTTGA AGTGAGCTGA GATCGCACCA 
2851 TTGCATTCCA GCTTGGGCTA CAGAGTGACA CTCCATCTCA AAAAAAAAAA 
2901 AAAAAAAAAA AAAAACCAGC ATCTTTGCTG CCACTAGTCC ACTGTCTTTG 
2951 CACTCACTCT CTGCCATGCC CATCCTTGTC CCCCTCCCCA CTCACAGACA 
3001 TGTCCAACAA CAGCCCCGAG TATGCTTTGG TTTTCACCAT CTCGGGTGCT 
3051 ATGGCCACCA TGGTCTCCAG TGGCCTGGGT GCTGCCTGTG GCATGGCCAA 
3101 GAATGGCACC GGCATCATGG CCATGTCTGT CATGTGGCCA GAGCTGATCC 
3151 ACATGAAGTC CATCATCCCA GTGGTCATGG CTGGTATCAT CACCATCTAT 
3201 GGCCTAGTGG CGGCTGTCCC CCCTGCCAAC TCCCTGAATG ATGACAACAG 
3251 TCTCTATAGC AGTTTCCTCC AGCTGGGCGC TGGCCTGAGT GGCCTGGCAG 
3301 CCGGCTTTGC CATCGTCATC GTGGGGGACA CTGGCAAGTG TGGCACTGCC 
3351 CAGCAGCCCC GACTATTTGT AGGCATGATA CTGATCCTCA TCTTTGCCAA 
3401 GGTGCTCATT CTCTCCACAA AGCAGCCCCT CTCAAAACCC ACCAGTCACA 

3451 GAATACGATG TAAAGACCAC CCCTCCTCAT TCCGGAACAA ACAGCCTGAC 

3501 ACGCATGTGC TGGGCAGCTG GCCCTCAGTA GTTGATCTTC TAAGTGTACA 

3551 GTGTCCTCGT GTTCATCGTC TGTTGGCCAG GCCTTGCCCC CTCCCGCCCC 

3601 ATGCTGTGGA CATCTGAACC TACTCATCAC CCATCCAGGT CCCCGACCAG 

3651 TGAGGACTCA GGCCCCTGGA TGCCCCACCC ATCTCCCTTG AGTACTCTAT 

3701 GTATAAGGAT GAATTAGAGT TGTCATTTTC TCTTCATTAG ATATTTATAA 

3751 AGATTTGGCC TGTCCATACC CCTGTGGAGC AGCCCTCATC TCCCACCTAT 

3801 CTGTCACGTC ATGGAGGTTC CCATTGCGGA GGCTCCTTGG ATGGAACCAC 

3851 CCTCTCCAGC CCGCGCTGCC AGGCCCTGTG CGGCAGCTGT GTCTGATAAA 

3901 GTTCTCAGAT GTCGGGGGAG GGAAAGAAAA AAAAAAGAGA GTGTGAGTAC 

3951 GTAAGAGAGA GAAAACGGGA GTGGGTGTGT GAGCTGGAGA CAGGGAAGTG 

4001 GCAGGAAAAG TCTGATAAGA TCACCTCCTT CCTACCCAAG CAGAGATACT 

4051 GGACACAGCC CCTCAAGGAC CCAGAGGGTA AGTAGAGCGC GAGATGCTTG 

4101 CCTTTCTCAA TGGGAGGTGG CCTCCCAGGC CTGAAGAAGT CTCCATTTAC 

4151 CCCAGAGCCA ACTAGGAAGC AGGTAGACAG CATCATCCCC ACTTATACCC 

4201 CAAGGTGCTT GGGGTGAATG GCAGGCCCAA AGCCAAAGCA TGAGACAGAT 

4251 TAAATGTTCC TATGGCGAGA GAAGGAGAAG GGGTCACCAG CATCTCTCCA 

4301 CTGAGCAAAT GAAAGGAAGA GAGAAGGCAG GCTGATACCC TCATCAATTT 

4351 CCTACTGTCC ATGATATACC ACCATCAACT GGACTTTTTT TTTTTTTTTG 

4401 AGATAGAGTC TCGCTTTTGT CACCCAGGCT GGAGTGCAGT GGCATGATCT 

4451 CAGCTCACTG CAACTTCCAT CTCCCAGGTT CAAGTGATTC TCCCGCCTCA 

4501 GCCTCCTGAG TAGCTGGGAT TACAGGTGCC TGCTACCACA TCCAGCTGAT 

4551 TTTTTTTGTA TTTTTAGTAG AGATGGGGTT TCTTTCTTTT TTTTTTTTTT 

4601 TTTTGAGACG GAGTCTTGCT CTGTCGCCCA GGCTGGAGTG CAGTGGCGCG 

4651 ATCTCGGCTC ACTGCAACCT CCGCCTCCCA GGTTCACGCC ATTCTCCTGC 

4701 CTCAGCCTCC CGAGTAGCTG GGACTACAGG CACCTGCCAC CACACTCGGC 

4751 TAATTTTTTG TATGTTTAGT AGATATGGGG TTTCACTGCT GTCTCAACCT 

4801 TCTGACCTCA TGATCCGCCC GCCTCGGCCT CCCAAAGTGC TGGGATTACA 

4851 GGCATGAGCC ACTGTGCCCG GCCTTTTTTT TTTTTTTTGA GATGGAGTCT 

4901 CGCTCTGTCG CCCAGGCTGG AGTGCAATGC CACAATCTCA GCTCACTGCA 

4951 AGCTCCACCT CCGAGGTTCA CGCCATTCTC CTGCCTCAGC CTCCTGAGTA 

5001 GCTGGGACTA CAGGCGCCCG CCACCACGCC CAGCTAATTT TTTGTATTTT 

5051 TAGTAGAGAC GGGGTTTCAC CTTGTTAGCC AGGATGGTCT TGATCTCCTG 

5101 ACCTCGTGAT CCACCTGCCT CAGCCTCCCA AAGTGCTGGG ATTACAGGTG 

5151 TGAGCCACCA TGCCTGGCCT TTTTTTTTTT TTTAAGACAG GAGTGTGGTG 

5201 GCACAATCTC AGCTCACTGC AACCTCCCCT TCTAGGTTCA AGCAATTCTC 

5251 CTGCCTCAGC TTCCTAAGTA TAGTAATAGC TGGGACTATA GGCGCCCACC 

5301 ACCACGCCCG GCTAATCTTT TGTATTTTTA GTAGAGATGG GGTTTCACCA 

5351 TGTTGGCCAG GCTGGTCTCG AATTGCTGAC CTCAAGTGAT CTGCCCACCT 

5401 GGGCCTCCCA AAGTGCTGGG ACTATAGGCG GGAGCCACCG CGCCCAGCCT 

5451 GGACTCTTTT TAATGAAGCC TTCAAAAAAA CTCCTTTTCT CAGCGCTTCT 

5501 TACTCTCTGA AACAGACTCT CCACTCTGCT AACCCTGCCT CTCACACTGT 

5551 GGAACTCAAC CGGATCTTTT TATTCTGAAT CCACAACGTG AAGTACTTGT 

5601 CCTCTGTCTA TCGATGGCTA CCTGTGTTTT GAAGTGTTTT TATGGGAATG 

5651 AAGCACTGGA GGGGAGGAAA TCAGGCCAGT TCTAGAAGTA GAAGGAAGGC 

5701 GAAGAAACCA GGAAAAATAT TTATGTGATG GGAGGAAAGG CAGTTTATAA 

5751 ATCACTCATG GATCTCTATG CCAGAGGGAT GTGTGAGACA CACGCATGCA 

5801 CACACACACT GACTTGCAGG TACATGCAGA GGCAGAAACA AGTCAGGACA 

5851 TGACACATAC ATGAATACAC ATACCATTCT CATCAGAAAC CAGTCAGAGC 

5901 AGAGGGGCCC TGCCTGGAGC AAGGAGACTG GAATTTATTC CCCTCCTCCT 

5951 CTCAAAGGGT AATTTTGCTG CCTCCATGTC TAGGTTCCCC ACAGATCTGG 

6001 CTGCCTCAGA CAGGGGCCCT GGTCTGGTGG CTGGACTCAG CCTGGAGGTC 

6051 TTCACAGATG GAGGCCTATA AGAGGTGGCA GCTGACACCT GGAGGGAGCT 

6101 GGATGAAAGC AGGCAGTGCA GAGTAGAGAA AGCCAGGTGG TGGGGGAGGG 

6151 AGTGAGGGAG AAGAGGGGAC CAGATTCAAG CAGCCTTGCG CTGGTTCTAA 

6201 AATGGCCACA GCAAGGCAAC GGACAGATGG TCCCTTTCTG ATGCTGAGCC 

6251 GGGGAAGTGG GGAAAGGGAA AAGGAAAAAA TAAACACCAT CACAGTCAGA 

6301 AATTTAAAAA TAAACTGAAA AACCTAAAAA ATAAACCGT 

(SEQ ID NO:3) 

FEATURES: 

Start: 3000 

Exon: 3000-3614 

Stop: 3615 



CHROMOSOME MAP POSITION: 
Chromosome 17 

BAC accession number: AC005973 



ALLELIC VARIANTS (SNPs): 

DNA
Position  Major  Minor  Domain
559       -      A      Beyond ORF(5')
3638      G      C      Beyond ORF(3')
5446      C      T      Beyond ORF(3')
5808      A      G      Beyond ORF(3')
5892      A      C      Beyond ORF(3')
6071      A      G      Beyond ORF(3')

Context:

DNA
Position
559  CCCTTCCCTCTGTACCCCGCTCCGCATGCGGCAAGTAGTTCCCGGACGCGACCCTTCCCC
     CTGTACCCCGCTCCGCATGGGGCCAGTAGTTCCCGGACGCGCCCCTTCCCTCTGTACCCC
     GCTCCGCATGCGGCAAGTAGTTCCCGGACGCGCCCCTTCCCTCTGTACCCGGCTCCGCAT
     GCGGAAAGTAGTTCCTACGGTGTTGGTTTTGCATGTAGATGAAACCCTTTGAGGGGTAAA
     GGTTTTTTTTTTAAGTACTTTAGCAAATGCAAACTGTTATTATCAATATTAGCCAGCATC
     [-,A,T]
     TTTTTTTTTTTTTTTTTTTTTTTTTTGAGATGGAGTTTCGCTCTTGTCACCCAGGCTGGA
     GTGCAATGGCAAAATCTAGGCTCACTGCAACCTCCGCCTCCCAAGTTCAAGCGATTCTCC
     TGCCTCAGCCTCCCAGGTAGCTGGGATTACAGGCGTGTGCAACCACACCCAGCTAATTTT
     TGTATTTTTAGTAGAGACAGGGTTTCACCATGTTGGCCAGGCTGGTCTCGAACTCCTGAC
     CTCATGTGATCCATCCGCCTCAGCCTCCCAAAGTGCTGGGATTACGTAGCCAGTGTCTTT

3638 GTGTGGCACTGCCCAGCAGCCCCGACTATTTGTAGGCATGATACTGATCCTCATCTTTGC
     CAAGGTGCTCATTCTCTCCACAAAGCAGCCCCTCTCAAAACCCACCAGTCACAGAATACG
     ATGTAAAGACCACCCCTCCTCATTCCGGAACAAACAGCCTGACACGCATGTGCTGGGCAG
     CTGGCCCTCAGTAGTTGATCTTCTAAGTGTACAGTGTCCTCGTGTTCATCGTCTGTTGGC
     CAGGCCTTGCCCCCTCCCGCCCCATGCTGTGGACATCTGAACCTACTCATCACCCATCCA
     [G,C]
     GTCCCCGACCAGTGAGGACTCAGGCCCCTGGATGCCCCACCCATCTCCCTTGAGTACTCT
     ATGTATAAGGATGAATTAGAGTTGTCATTTTCTCTTCATTAGATATTTATAAAGATTTGG
     CCTGTCCATACCCCTGTGGAGCAGCCCTCATCTCCCACCTATCTGTCACGTCATGGAGGT
     TCCCATTGCGGAGGCTCCTTGGATGGAACCACCCTCTCCAGCCCGCGCTGCCAGGCCCTG
     TGCGGCAGCTGTGTCTGATAAAGTTCTCAGATGTCGGGGGAGGGAAAGAAAAAAAAAAGA

5446 AGGTGTGAGCCACCATGCCTGGCCTTTTTTTTTTTTTTAAGACAGGAGTGTGGTGGCACA
     ATCTCAGCTCACTGCAACCTCCCCTTCTAGGTTCAAGCAATTCTCCTGCCTCAGCTTCCT
     AAGTATAGTAATAGCTGGGACTATAGGCGCCCACCACCACGCCCGGCTAATCTTTTGTAT
     TTTTAGTAGAGATGGGGTTTCACCATGTTGGCCAGGCTGGTCTCGAATTGCTGACCTCAA
     GTGATCTGCCCACCTGGGCCTCCCAAAGTGCTGGGACTATAGGCGGGAGCCACCGCGCCC
     [C,T,A]
     GCCTGGACTCTTTTTAATGAAGCCTTCAAAAAAACTCCTTTTCTCAGCGCTTCTTACTCT
     CTGAAACAGACTCTCCACTCTGCTAACCCTGCCTCTCACACTGTGGAACTCAACCGGATC
     TTTTTATTCTGAATCCACAACGTGAAGTACTTGTCCTCTGTCTATCGATGGCTACCTGTG
     TTTTGAAGTGTTTTTATGGGAATGAAGCACTGGAGGGGAGGAAATCAGGCCAGTTCTAGA
     AGTAGAAGGAAGGCGAAGAAACCAGGAAAAATATTTATGTGATGGGAGGAAAGGCAGTTT

5808 TGAAACAGACTCTCCACTCTGCTAACCCTGCCTCTCACACTGTGGAACTCAACCGGATCT
     TTTTATTCTGAATCCACAACGTGAAGTACTTGTCCTCTGTCTATCGATGGCTACCTGTGT
     TTTGAAGTGTTTTTATGGGAATGAAGCACTGGAGGGGAGGAAATCAGGCCAGTTCTAGAA
     GTAGAAGGAAGGCGAAGAAACCAGGAAAAATATTTATGTGATGGGAGGAAAGGCAGTTTA
     TAAATCACTCATGGATCTCTATGCCAGAGGGATGTGTGAGACACACGCATGCACACACAC
     [A,G]
     CTGACTTGCAGGTACATGCAGAGGCAGAAACAAGTCAGGACATGACACATACATGAATAC
     ACATACCATTCTCATCAGAAACCAGTCAGAGCAGAGGGGCCCTGCCTGGAGCAAGGAGAC
     TGGAATTTATTCCCCTCCTCCTCTCAAAGGGTAATTTTGCTGCCTCCATGTCTAGGTTCC
     CCACAGATCTGGCTGCCTCAGACAGGGGCCCTGGTCTGGTGGCTGGACTCAGCCTGGAGG
     TCTTCACAGATGGAGGCCTATAAGAGGTGGCAGCTGACACCTGGAGGGAGCTGGATGAAA

5892 AGTACTTGTCCTCTGTCTATCGATGGCTACCTGTGTTTTGAAGTGTTTTTATGGGAATGA
     AGCACTGGAGGGGAGGAAATCAGGCCAGTTCTAGAAGTAGAAGGAAGGCGAAGAAACCAG
     GAAAAATATTTATGTGATGGGAGGAAAGGCAGTTTATAAATCACTCATGGATCTCTATGC
     CAGAGGGATGTGTGAGACACACGCATGCACACACACACTGACTTGCAGGTACATGCAGAG
     GCAGAAACAAGTCAGGACATGACACATACATGAATACACATACCATTCTCATCAGAAACC
     [A,C]
     GTCAGAGCAGAGGGGCCCTGCCTGGAGCAAGGAGACTGGAATTTATTCCCCTCCTCCTCT
     CAAAGGGTAATTTTGCTGCCTCCATGTCTAGGTTCCCCACAGATCTGGCTGCCTCAGACA
     GGGGCCCTGGTCTGGTGGCTGGACTCAGCCTGGAGGTCTTCACAGATGGAGGCCTATAAG
     AGGTGGCAGCTGACACCTGGAGGGAGCTGGATGAAAGCAGGCAGTGCAGAGTAGAGAAAG
     CCAGGTGGTGGGGGAGGGAGTGAGGGAGAAGAGGGGACCAGATTCAAGCAGCCTTGCGCT

6071 CCAGAGGGATGTGTGAGACACACGCATGCACACACACACTGACTTGCAGGTACATGCAGA
     GGCAGAAACAAGTCAGGACATGACACATACATGAATACACATACCATTCTCATCAGAAAC
     CAGTCAGAGCAGAGGGGCCCTGCCTGGAGCAAGGAGACTGGAATTTATTCCCCTCCTCCT
     CTCAAAGGGTAATTTTGCTGCCTCCATGTCTAGGTTCCCCACAGATCTGGCTGCCTCAGA
     CAGGGGCCCTGGTCTGGTGGCTGGACTCAGCCTGGAGGTCTTCACAGATGGAGGCCTATA
     [A,G]
     GAGGTGGCAGCTGACACCTGGAGGGAGCTGGATGAAAGCAGGCAGTGCAGAGTAGAGAAA
     GCCAGGTGGTGGGGGAGGGAGTGAGGGAGAAGAGGGGACCAGATTCAAGCAGCCTTGCGC
     TGGTTCTAAAATGGCCACAGCAAGGCAACGGACAGATGGTCCCTTTCTGATGCTGAGCCG
     GGGAAGTGGGGAAAGGGAAAAGGAAAAAATAAACACCATCACAGTCAGAAATTTAAAAAT
     AAACTGAAAAACCTAAAAAATAAACCGT